Oracle Big Data SQL 3.0 can connect Oracle Database to the Hadoop environment on Oracle Big Data Appliance, other systems based on CDH (Cloudera's Distribution including Apache Hadoop), HDP (Hortonworks Data Platform), and potentially other non-CDH Hadoop systems.
The procedures for installing Oracle Big Data SQL in these environments differ. To install the product in your particular environment, see the appropriate section:
Installing On Oracle Big Data Appliance and the Oracle Exadata Database Machine
See this section for installation on Oracle Big Data Appliance and Exadata servers only.
Installing Oracle Big Data SQL on Other Hadoop Systems
See this section for installation on both CDH (excluding Oracle Big Data Appliance) and non-CDH (specifically, HDP) systems.
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1) in My Oracle Support for up-to-date information on Big Data SQL compatibility with the following:
Oracle Engineered Systems.
Other systems.
Linux OS distributions and versions.
Hadoop distributions.
Oracle Database releases, including required patches.
To use Oracle Big Data SQL on an Oracle Exadata Database Machine connected to Oracle Big Data Appliance, you must install the Oracle Big Data SQL software on both systems.
Follow these steps to install the Oracle Big Data SQL software on Oracle Big Data Appliance and Oracle Exadata Database Machine.
Note:
This procedure is not applicable to the installation of Oracle Big Data SQL on systems other than Oracle Big Data Appliance and Oracle Exadata Database Machine.
The January 2016 Bundle Patch (12.1.0.2.160119 BP) for Oracle Database must be pre-installed on the Exadata Database Machine. Earlier Bundle Patches are not supported at this time.
You can use Cloudera Manager to verify that Oracle Big Data SQL is up and running.
When you are done, if the cluster is secured by Kerberos then there are additional steps you must perform on both the cluster nodes and on the Oracle Exadata Database Machine. See Enabling Oracle Big Data SQL Access to a Kerberized Cluster.
In the case of an Oracle Big Data Appliance upgrade, the customer is responsible for upgrading the Oracle Database to a supported level before re-running the post-installation script.
Important
Run bds-exa-install.sh on every node of the Exadata cluster. If this is not done, you will see RPC connection errors when the BDS service is started.
To run the Oracle Big Data SQL post-installation script:
Copy the bds-exa-install.sh installation script from the Oracle Big Data Appliance to a temporary directory on the Oracle Exadata Database Machine. (Find the script on the node where Mammoth is installed, typically the first node in the cluster.) For example:
# curl -O http://bda1node07/bda/bds-exa-install.sh
Verify the name of the Oracle installation owner and set the executable bit for this user. Typically, the oracle user owns the installation. For example:
$ ls -l bds-exa-install.sh
$ chown oracle:oinstall bds-exa-install.sh
$ chmod +x bds-exa-install.sh
Set the following environment variables:
$ORACLE_HOME to <database home>
$ORACLE_SID to <correct db SID>
$GI_HOME to <correct grid home>
Note:
Instead of setting $GI_HOME in this step, you can pass the grid home to the install script, as described in step 5 d.
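As a sketch, with illustrative paths (your database home, SID, and grid home will differ), the variables can be exported like this:

```shell
# Illustrative values only -- substitute your actual database home, SID, and grid home.
export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
export ORACLE_SID=orcl
export GI_HOME=/u01/app/12.1.0.2/grid
echo "$ORACLE_HOME $ORACLE_SID $GI_HOME"
```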
Check that TNS_ADMIN points to the directory containing the listener.ora for the running listener. If the listener is in the default TNS_ADMIN location, $ORACLE_HOME/network/admin, there is no need to define TNS_ADMIN. But if the listener is in a non-default location, TNS_ADMIN must correctly point to it, using the command:
export TNS_ADMIN=<path to listener.ora>
Perform this step only if the ORACLE_SID is in uppercase; otherwise, proceed to the next step. The install script can derive the CRS database resource from ORACLE_SID only when the SID is in lowercase. If the SID is in uppercase, perform the following steps to pass it to the script manually:
Run the following command to list all the resources.
$ crsctl stat res -t
From the output, note down the ora.<dbresource>.db resource name.
Run the following command to verify that the correct ora.<dbresource>.db resource name is returned.
$ ./crsctl stat res ora.<dbresource>.db
The output displays the resource names as follows:
NAME=ora.<dbresource>.db
TYPE=ora.database.type
TARGET=ONLINE , ONLINE
STATE=ONLINE on <name01>, ONLINE on <name02>
Specify --db-name=<dbresource> as an additional argument to the install script as follows:
./bds-exa-install.sh --db-name=<dbresource>
Additionally, instead of setting $GI_HOME as mentioned in step 3, you can pass the grid home along with the above command as follows:
./bds-exa-install.sh --db-name=<dbresource> --grid-home=<grid home>
Note:
If you performed this step, you can skip the next step.
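The resource-name lookup in the steps above can also be scripted. A minimal sketch, shown against a sample NAME line (the sed pattern is illustrative, not part of the product; in practice the line would come from crsctl stat res -t):

```shell
# Extract <dbresource> from a "NAME=ora.<dbresource>.db" line of crsctl output.
line='NAME=ora.orcl.db'
dbres=$(echo "$line" | sed -n 's/^NAME=ora\.\(.*\)\.db$/\1/p')
echo "$dbres"
```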
Run the script as any user who has dba privileges (who can connect to sys as sysdba).
./bds-exa-install.sh
When prompted by the script, you must run the generated root shell script as root in another session before proceeding as the oracle user. For example,
$ ./bds-exa-install.sh
bds-exa-install: root shell script : /u01/app/oracle/product/12.1.0.2/dbhome_1/install/bds-root-<cluster-name>-setup.sh
please run as root: /u01/app/oracle/product/12.1.0.2/dbhome_1/install/bds-root-<rack-name>-clu-setup.sh
A sample output is shown here:
bds-exa-install: platform is Linux
bds-exa-install: setup script started at : Sun Feb 14 20:06:17 PST 2016
bds-exa-install: bds version : bds-3.0-0.el6.x86_64
bds-exa-install: bda cluster name : mycluster1
bds-exa-install: bda web server : mycluster1bda16.us.oracle.com
bds-exa-install: cloudera manager url : mycluster1bda18.us.oracle.com:7180
bds-exa-install: hive version : hive-1.1.0-cdh5.5.1
bds-exa-install: hadoop version : hadoop-2.6.0-cdh5.5.1
bds-exa-install: bds install date : 02/14/2016 12:00 PST
bds-exa-install: bd_cell version : bd_cell-12.1.2.0.100_LINUX.X64_160131-1.x86_64
bds-exa-install: action : setup
bds-exa-install: crs : true
bds-exa-install: db resource : orcl
bds-exa-install: database type : SINGLE
bds-exa-install: cardinality : 1
bds-exa-install: root shell script : /u03/app/oracle/product/12.1.0/dbhome_1/install/bds-root-mycluster1-setup.sh
please run as root: /u03/app/oracle/product/12.1.0/dbhome_1/install/bds-root-mycluster1-setup.sh
waiting for root script to complete, press <enter> to continue checking.. q<enter> to quit
bds-exa-install: root script seem to have succeeded, continuing with setup bds
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/install
bds-exa-install: downloading JDK
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/install
bds-exa-install: installing JDK tarball
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/jdk1.8.0_66/jre/lib/security
bds-exa-install: Copying JCE policy jars
/bin/mkdir: cannot create directory `bigdata_config/mycluster1': File exists
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/jlib
bds-exa-install: removing old oracle bds jars if any
bds-exa-install: downloading oracle bds jars
bds-exa-install: installing oracle bds jars
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql
bds-exa-install: downloading : hadoop-2.6.0-cdh5.5.1.tar.gz
bds-exa-install: downloading : hive-1.1.0-cdh5.5.1.tar.gz
bds-exa-install: unpacking : hadoop-2.6.0-cdh5.5.1.tar.gz
bds-exa-install: unpacking : hive-1.1.0-cdh5.5.1.tar.gz
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/hadoop-2.6.0-cdh5.5.1/lib
bds-exa-install: downloading : cdh-ol6-native.tar.gz
bds-exa-install: creating /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/hadoop_mycluster1.env for hdfs/mapred client access
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql
bds-exa-install: creating bds property files
bds-exa-install: working directory : /u03/app/oracle/product/12.1.0/dbhome_1/bigdatasql/bigdata_config
bds-exa-install: created bigdata.properties
bds-exa-install: created bigdata-log4j.properties
bds-exa-install: creating default and cluster directories needed by big data external tables
bds-exa-install: note this will grant default and cluster directories to public!
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_29579.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: granted default and cluster directories to public!
bds-exa-install: mta set to use listener end point : EXTPROC1521
bds-exa-install: mta will be setup
bds-exa-install: creating /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin/initbds_orcl_mycluster1.ora
bds-exa-install: mta setting agent home as : /u03/app/oracle/product/12.1.0/dbhome_1/hs/admin
bds-exa-install: mta shutdown : bds_orcl_mycluster1
bds-exa-install: registering crs resource : bds_orcl_mycluster1
bds-exa-install: using dependency db resource of orcl
bds-exa-install: starting crs resource : bds_orcl_mycluster1
CRS-2672: Attempting to start 'bds_orcl_mycluster1' on 'mycluster1bda09'
CRS-2676: Start of 'bds_orcl_mycluster1' on 'mycluster1bda09' succeeded
NAME=bds_orcl_mycluster1
TYPE=generic_application
TARGET=ONLINE
STATE=ONLINE on mycluster1bda09
bds-exa-install: patching view LOADER_DIR_OBJS
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30123.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: creating mta dblinks
bds-exa-install: cluster name : mycluster1
bds-exa-install: extproc sid : bds_orcl_mycluster1
bds-exa-install: cdb : true
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink_catcon_30153.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_dropdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink_catcon_30179.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_dropdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink_catcon_30205.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_dbcluster_createdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink_catcon_30231.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_default_createdblink_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30257.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
catcon: ALL catcon-related output will be written to /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_catcon_30283.lst
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon*.log files for output generated by scripts
catcon: See /u03/app/oracle/product/12.1.0/dbhome_1/install/bdscatcon_*.lst files for spool files, if any
catcon.pl: completed successfully
bds-exa-install: setup script completed all steps
For additional details see "Running the bds-exa-install Script".
If you have a multi-instance database, repeat step 6 for each database instance.
When the script completes, the following items are available on the database instance and Oracle Big Data SQL is running. However, if events cause the Oracle Big Data SQL agent to stop, then you must restart it. See "Starting and Stopping the Big Data SQL Agent".
Oracle Big Data SQL directory and configuration, including jar, environment, and properties files.
Database dba_directories.
Database dblinks.
Database big data spfile parameter.
For example, you can verify the dba_directories from the SQL prompt as follows:
SQL> select * from dba_directories where directory_name like '%BIGDATA%';
The bds-exa-install script generates a custom installation script that is run by the owner of the Oracle home directory. That secondary script installs all the files needed by Oracle Big Data SQL into the $ORACLE_HOME/bigdatasql directory. For Oracle NoSQL Database support, it installs the client library (kvclient.jar). It also creates the database directory objects and the database links for the multithreaded Oracle Big Data SQL agent.
The following is the bds-exa-install syntax:
Usage: bds-exa-install oracle-sid=<orcl> ( --version --info --root-script-only --uninstall-as-primary --uninstall-as-secondary --install-as-secondary --jdk-home=<dir> --grid-home=<dir> )*
Options
  --version                 Prints script version.
  --info                    Print information such as cluster name, CM host, Oracle Big Data Appliance HTTP server.
  --root-script-only        Only generate the root script.
  --uninstall-as-primary    Uninstall scaj31cdh, including hadoop client jars. Note: after this any secondary clusters need to be reinstalled.
  --uninstall-as-secondary  Attempt to uninstall scaj31cdh as a secondary cluster.
  --install-as-secondary    Default = false. Do not install client libraries, etc. The primary cluster will not be affected.
  --jdk-home                For example: /opt/oracle/bd_cell12.1.2.0.100_LINUX.X64_150912.1/jdk
  --grid-home               Oracle Grid Infrastructure home. For example: "/opt/oracle/bd_cell12.1.2.0.100_LINUX.X64_150912.1/../grid"
In case of problems running the install script on Exadata, perform the following steps and open an SR with Oracle Support with the details:
Collect the debug output by running the script in a debug mode as follows:
$ ./bds-exa-install.sh --db-name=<dbresource> --grid-home=<grid home> --root-script=false --debug
OR
$ ./bds-exa-install.sh --root-script=false --debug
Collect the Oracle Database version as follows:
Collect the result of opatch lsinventory from the RDBMS-RAC home.
Collect the result of opatch lsinventory from the Grid home.
Collect the result of the following SQL statement to confirm that Datapatch is set up:
SQL> select patch_id, patch_uid, version, bundle_series, bundle_id, action, status from dba_registry_sqlpatch;
Collect the information from the following environment variables:
$ORACLE_HOME
$ORACLE_SID
$GI_HOME
$TNS_ADMIN
Collect the result of running the lsnrctl status command.
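A minimal sketch of gathering the environment-variable portion of this information into one file (the file name is illustrative; append the opatch and lsnrctl output to the same file before attaching it to the SR):

```shell
# Gather the four environment variables into a diagnostics file.
out=/tmp/bds_diag.txt
{
  echo "ORACLE_HOME=$ORACLE_HOME"
  echo "ORACLE_SID=$ORACLE_SID"
  echo "GI_HOME=$GI_HOME"
  echo "TNS_ADMIN=$TNS_ADMIN"
} > "$out"
wc -l < "$out"
```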
Oracle Big Data Appliance already provides numerous security features to protect data stored in a CDH cluster on Oracle Big Data Appliance:
Kerberos authentication: Requires users and client software to provide credentials before accessing the cluster.
Apache Sentry authorization: Provides fine-grained, role-based authorization to data and metadata.
HDFS Transparent Encryption: Protects the data on disk and at rest. Data encryption and decryption is transparent to applications using the data.
Oracle Audit Vault and Database Firewall monitoring: The Audit Vault plug-in on Oracle Big Data Appliance collects audit and logging data from MapReduce, HDFS, and Oozie services. You can then use Audit Vault Server to monitor these services on Oracle Big Data Appliance.
Oracle Big Data SQL adds the full range of Oracle Database security features to this list. You can apply the same security policies and rules to your Hadoop data that you apply to your relational data.
In order to give Oracle Big Data SQL access to HDFS data on a Kerberos-enabled cluster, make each Oracle Exadata Database Machine that needs access a Kerberos client. Also run kinit on the oracle account on each cluster node and Exadata Database Machine to ensure that the account is authenticated by Kerberos. There are two situations where this procedure is required:
When enabling Oracle Big Data SQL on a Kerberos-enabled cluster.
When enabling Kerberos on a cluster where Oracle Big Data SQL is already installed.
Note:
Oracle Big Data SQL queries will run on the Hadoop cluster as the owner of the Oracle Database process (that is, the oracle user). Therefore, the oracle user needs a valid Kerberos ticket in order to access data. This ticket is required for every Oracle Database instance that accesses the cluster. A valid ticket is also needed for each Big Data SQL Server process running on the Oracle Big Data Appliance. Run kinit oracle to obtain the ticket.
These steps enable the operating system user to authenticate with the kinit utility before submitting Oracle SQL Connector for HDFS jobs. The kinit utility typically uses a Kerberos keytab file for authentication without an interactive prompt for a password.
On each node of the cluster:
Log in as the oracle user.
Run kinit on the oracle account.
$ kinit oracle
Enter the Kerberos password.
Log on to the primary node and then stop and restart Oracle Big Data SQL.
$ bdacli stop big_data_sql_cluster
$ bdacli start big_data_sql_cluster
On all Oracle Exadata Database Machines that need access to the cluster:
Copy the Kerberos configuration file /etc/krb5.conf from the node where Mammoth is installed to the same path on each Oracle Exadata Database Machine.
Run kinit on the oracle account and enter the Kerberos password.
Re-run the Oracle Big Data SQL post-installation script:
$ ./bds-exa-install.sh
Avoiding Kerberos Ticket Expiration
The system should run kinit on a regular basis, before letting the Kerberos ticket expire, to enable Oracle SQL Connector for HDFS to authenticate transparently. Use cron or a similar utility to run kinit. For example, if Kerberos tickets expire every two weeks, then set up a cron job to renew the ticket weekly.
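For example, a weekly crontab entry for the oracle user might look like the following (the keytab path is an assumption; kinit -k -t authenticates from a keytab without a password prompt):

```shell
# Renew the Kerberos ticket every Sunday at 02:00; the keytab path is illustrative.
entry='0 2 * * 0 /usr/bin/kinit oracle -k -t /home/oracle/oracle.keytab'
echo "$entry"
```

Add the line with crontab -e as the oracle user on each node that needs a current ticket.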
The Big Data SQL agent on the database is managed by Oracle Clusterware. The agent is registered with Oracle Clusterware during Big Data SQL installation so that it starts and stops automatically with the database. To check the status, you can run mtactl check from the Oracle Grid Infrastructure home or Oracle Clusterware home:
# mtactl check bds_databasename_clustername
Oracle Big Data SQL is deployed using the services provided by the cluster management server. The installation process uses the management server API to register the service and start the deployment task. From there, the management server controls the process.
After installing Big Data SQL on the cluster management server, use the tools provided in the bundle to generate an installation package for the database server side.
You can download Oracle Big Data SQL from the Oracle Software Delivery Cloud.
Table 2-1 Oracle Big Data SQL Product Bundle Inventory
File | Description
setup-bds | Cluster-side installation script
bds-config.json | Configuration file
api_env.sh | Setup REST API environment script
platform_env.sh | BDS service configuration script
BIGDATASQL-1.0.jar | CSD file (in the Cloudera product bundle only)
bin/json-select | JSON-select utility
db/bds-database-create-bundle.sh | Database bundle creation script
db/database-install.zip | Database-side installation files
repo/BIGDATASQL-1.0.0-el6.parcel | Parcel file (in the CDH product bundle only)
repo/manifest.json | Hash key for the parcel file (in the CDH product bundle only)
BIGDATASQL-1.0.0-el6.stack | Stack file (in the HDP product bundle only)
setup-db.sh | Script to acquire cluster information (currently used in the manual portion of the HDP cluster-side installation)
The following are required in order to install Oracle Big Data SQL on the Hortonworks Hadoop Data Platform (HDP).
Services Running
The following services must be running at the time of the Big Data SQL installation:
HDP 2.3
Ambari 2.1.0
HDFS 2.7.1
YARN 2.7.1
Zookeeper 3.4.6
Hive 1.2.1
Tez 0.7.0
Packages
The following packages must be pre-installed before installing Big Data SQL.
JDK version 1.7 or later
Python version 2.6.
OpenSSL version 1.01 build 16 or later
System Tools
curl
rpm
scp
tar
unzip
wget
yum
Environment Settings
The following environment settings are required prior to the installation.
ntp enabled
iptables disabled
Ensure that /usr/java/default exists and is linked to the appropriate Java version. To link it to the latest Java version, perform the following as root:
$ ln -s /usr/java/latest /usr/java/default
Access Control Settings
The following user and group must exist.
oracle user
oinstall group
The oracle user must be a member of the oinstall group.
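A quick membership check can be sketched as follows. Because the oracle user may not exist on the machine where this is run, the check is shown against a sample id output (the numeric IDs are illustrative; the doc does not mandate any):

```shell
# Sample output of `id oracle` on a correctly configured node (illustrative IDs).
sample='uid=500(oracle) gid=500(oinstall) groups=500(oinstall)'
case "$sample" in
  *"(oinstall)"*) echo "oracle is in oinstall" ;;
  *)              echo "add oracle to oinstall" ;;
esac
```

On a real node, replace the sample string with the output of id oracle.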
If Oracle Big Data SQL is Already Installed
If the Ambari Web GUI shows that Big Data SQL service is already installed, make sure that all Big Data SQL Cell components are stopped before reinstalling. (Use the actions button, as with any other service.)
The following conditions must be met when installing Oracle Big Data SQL on a CDH cluster that is not part of an Oracle Big Data Appliance.
Note:
The installation prerequisites and the procedure for installing Oracle Big Data SQL on Oracle Big Data Appliance differ from the process used for installations on other CDH systems. See Installing On Oracle Big Data Appliance and the Oracle Exadata Database Machine if you are installing on Oracle Big Data Appliance.
Services Running
The following services must be running at the time of the Oracle Big Data SQL installation:
Cloudera’s Distribution including Apache Hadoop (CDH) 5.5 and higher
HDFS 2.6.0
YARN 2.6.0
Zookeeper 3.4.5
Hive 1.1.0
Packages
The following packages must be pre-installed before installing Oracle Big Data SQL. The Oracle clients are available for download on the Oracle Technology Network.
JDK version 1.7 or later
Oracle Instant Client – 12.1.0.2 or higher, e.g. oracle-instantclient12.1-basic-12.1.0.2.0-1.x86_64.rpm
Oracle Instant JDBC Client – 12.1.0.2 or higher, e.g. oracle-instantclient12.1-jdbc-12.1.0.2.0-
PERL LibXML – 1.7.0 or higher, e.g. perl-XML-LibXML-1.70-5.el6.x86_64.rpm
Apache log4j
System Tools
unzip
finger
wget
Environment Settings
The following environment settings are required prior to the installation.
Ensure that /usr/java/default exists and is linked to the appropriate Java version. To link it to the latest Java version, perform the following as root:
$ ln -s /usr/java/latest /usr/java/default
The path to the Java binaries must exist in /usr/java/latest.
The default path to Hadoop libraries must be in /opt/cloudera/parcels/CDH/lib/.
Access Control Settings
The following user and group must exist.
oracle user
oinstall group
The oracle user must be a member of the oinstall group.
Settings to Save Before the Installation
If resource management is enabled on Cloudera Manager, then before installing Big Data SQL, save YARN's resource management configuration so that it can be restored to the original state if Big Data SQL is uninstalled later.
The Oracle Big Data SQL installation consists of two stages.
Cluster-side installation:
Deploys binaries across the cluster.
Configures Linux and network settings for the service on each cluster node.
Configures the service on the management server.
Acquires cluster information for configuring the database connection.
Creates database bundle for the database side installation.
Oracle Database server-side installation:
Copies binaries onto the database node.
Configures network settings for the service.
Inserts cluster metadata into the database.
The first step of the Oracle Big Data SQL installation is to run the installer on the Hadoop cluster management server (where Cloudera Manager runs on a CDH system or Ambari runs on an HDP system). As a post-installation task on the management server, you then run the script that prepares the installation bundle for the database server.
Extract the files from the BIGDATASQL product bundle saved from the download (either BigDataSQL-CDH-<version>.zip or BigDataSQL-HDP-<version>.zip), then configure and run the Oracle Big Data SQL installer found within the bundle. This installs Oracle Big Data SQL on the local server.
Run the database bundle creation script. This script generates the database bundle file that you will run on the Oracle Database server in order to install Oracle Big Data SQL there.
Check the parameters in the database bundle file and adjust as needed.
After you have checked and (if necessary) edited the database bundle file, copy it over to the Oracle Database server and run it as described in Installing on the Oracle Database Server.
Install Big Data SQL on the Cluster Management Server
To install Big Data SQL on the cluster management server:
Copy the appropriate zip file (BigDataSQL-CDH-<version>.zip or BigDataSQL-HDP-<version>.zip) to a temporary location on the cluster management server.
Unzip the file.
Change directories to either BigDataSQL-HDP-<version> or BigDataSQL-CDH-<version>, depending on which platform you are working with.
Edit the configuration file.
Table 2-2 below describes the use of each configuration parameter.
For CDH, edit bds-config.json, as in this example. Any unused port will work as the web server port.
{
  "CLUSTER_NAME" : "cluster",
  "CSD_PATH" : "/opt/cloudera/csd",
  "DATABASE_IP" : "10.12.13.14/24",
  "REST_API_PORT" : "7180",
  "WEB_SERVER_PORT" : "81"
}
For HDP, edit bds-config.json as in this example:
{
  "CLUSTER_NAME" : "clustername",
  "DATABASE_IP" : "10.10.10.10/24",
  "REST_API_PORT" : "8080"
}
DATABASE_IP must be the correct network interface address for the database node where you will perform the installation. You can confirm this by running /sbin/ip -o -f inet addr show on the database node.
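To pick the interface name and address out of that command's output, a sketch (shown against a captured sample line, since actual output is machine-specific):

```shell
# Fields 2 and 4 of `ip -o -f inet addr show` are the interface name and address/prefix.
sample='2: eth0    inet 10.12.13.14/24 brd 10.12.13.255 scope global eth0'
echo "$sample" | awk '{print $2, $4}'
```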
Obtain the cluster administrator user ID and password, and then as root run setup-bds, passing it the configuration file name (bds-config.json) as an argument. The script prompts for the administrator credentials and then installs BDS on the management server.
$ ./setup-bds bds-config.json
Table 2-2 Configuration Parameters for setup-bds
Configuration Parameter | Use | Applies To
CLUSTER_NAME | The name of the cluster on the Hadoop server. | CDH, HDP
CSD_PATH | Location of Custom Service Descriptor files. | CDH only
DATABASE_IP | The IP address of the Oracle Database server that will make connection requests. The address must include the prefix length (as in 100.112.10.36/24). Although only one IP address is specified in the configuration file, it is possible to install the database-side software on multiple database servers by using a command line parameter to override | CDH, HDP
REST_API_PORT | The port where the cluster management server listens for requests. | CDH, HDP
WEB_SERVER_PORT | A port assigned temporarily to a repository for deployment tasks during installation. This can be any port where the assignment does not conflict with cluster operations. | CDH only
Important
Be sure that the address provided for DATABASE_IP is the correct address of a network interface on the database server and is accessible from each DataNode of the Hadoop system; otherwise the installation will fail. You can test that the database IP replies to a ping from each DataNode. Also, currently the address string (including the prefix length) must be at least nine characters long.
If the Oracle Big Data SQL Service Immediately Fails
If Ambari or Cloudera Manager reports an Oracle Big Data SQL service failure immediately after service startup, do the following.
Check the cell server (CELLSRV) log on the cluster management server for the following error at the time of failure:
ossnet_create_box_handle: failed to parse ip : <IP Address>
If the IP address in the error message is less than nine characters in length (for example, 10.0.1.4/24), then on the cluster management server, find this address in /opt/oracle/bd_cell/cellsrv/deploy/config/cellinit.ora. Edit the string by padding one or more of the octets with leading zeros to make the total at least nine characters in length, as in:
ipaddress1=10.0.1.004/24
Restart the Oracle Big Data SQL service.
The need for this workaround will be eliminated in a subsequent Oracle Big Data SQL release.
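The zero-padding edit above can be sketched as a small derivation of the corrected cellinit.ora entry; this pads the final octet to three digits (one of several valid paddings, using the doc's example address):

```shell
# Pad the final octet to three digits so "10.0.1.4/24" becomes "10.0.1.004/24".
addr='10.0.1.4/24'
ip=${addr%/*}; prefix=${addr#*/}
head=${ip%.*}; last=${ip##*.}
printf 'ipaddress1=%s.%03d/%s\n' "$head" "$last" "$prefix"
```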
On the cluster management server, run the database bundle creation script from the Oracle Big Data SQL download to create an installation bundle to install the product on the Oracle Database server. If some of the external resources that the script requires are not accessible from the management server, you can add them manually.
The database bundle creation script attempts to download the following:
Hadoop and Hive client tarballs from Cloudera or Hortonworks repository web site.
Configuration files for Yarn and Hive from the cluster management server, via Cloudera Manager (for the CDH versions) or Ambari (for the HDP versions).
For HDP only, HDFS and MapReduce configuration files from Ambari.
Change directories to BigDataSQL-CDH-<version>/db or BigDataSQL-HDP-<version>/db.
Run the BDS database bundle creation script. See the table below for optional parameters that you can pass to the script in order to override any of the default settings.
$ bds-database-create-bundle.sh <optional parameters>
The message below is returned if the operation is successful.
bds-database-create-bundle: database bundle creation script completed all steps
The database bundle file includes a number of parameters. You can change any of these parameters as necessary. Any URLs specified must be accessible from the cluster management server at the time you run bds-database-create-bundle.sh.
Table 2-3 Command Line Parameters for bds-database-create-bundle.sh
Parameter | Value
--hadoop-client-ws / --no-hadoop-client-ws | Specifies a URL for the Hadoop client tarball download, or bypasses download of this client.
--hive-client-ws / --no-hive-client-ws | Specifies a URL for the Hive client tarball download, or bypasses download of this client.
--yarn-conf-ws / --no-yarn-conf-ws | Specifies a URL for the YARN configuration zip file download, or bypasses this download.
--hive-conf-ws / --no-hive-conf-ws | Specifies a URL for the Hive configuration zip file download, or bypasses this download.
--ignore-missing-files | Creates the bundle file even if some files are missing.
--clean-previous | Deletes previous bundle files and directories from bds-database-install/.
--script-only | Only creates the script database installation file.
--hdfs-conf-ws / --no-hdfs-conf-ws | Specifies a URL for the HDFS configuration zip file download, or bypasses this download (HDP only).
--mapreduce-conf-ws / --no-mapreduce-conf-ws | Specifies a URL for the MapReduce configuration zip file download, or bypasses this download (HDP only).
Note:
In Big Data SQL 3.0, bds-database-create-bundle.sh does not include a command line parameter to override the default JDK (jdk-8u66-linux-x64). To include a different version of the JDK, run bds-database-create-bundle.sh twice, as follows.
Run bds-database-create-bundle.sh to generate some files that you will edit or replace.
Remove the existing JDK file, /bds_database_install/jdk-8u66-linux-x64.tar.gz.
Remove the bds-database-install.zip bundle file generated by this first run of the script.
Manually download the JDK tarball from Oracle Technology Network. Copy it into /bds_database_install.
Edit bds-database-install/db/create-bundle.env. Update the $jdktar environment variable to match the JDK you downloaded. For example: jdktar=jdk-8u77-linux-x64.tar.gz
Run bds-database-create-bundle.sh again to generate a new database bundle.
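The file manipulation between the two script runs can be sketched in shell. This is a non-authoritative sketch: the bundle directory layout and the jdktar edit follow the steps above, and the tarball names are examples only.

```shell
# Sketch of swapping the JDK before the second bundle run.
# BUNDLE_DIR and the tarball names are examples; adjust for your system.
BUNDLE_DIR="${BUNDLE_DIR:-bds-database-install}"
OLD_JDK="jdk-8u66-linux-x64.tar.gz"
NEW_JDK="${NEW_JDK:-jdk-8u77-linux-x64.tar.gz}"   # tarball downloaded from OTN

# Remove artifacts of the first bds-database-create-bundle.sh run.
rm -f "$BUNDLE_DIR/$OLD_JDK" bds-database-install.zip

# Copy the manually downloaded JDK into the bundle directory, if present.
if [ -f "$NEW_JDK" ]; then cp "$NEW_JDK" "$BUNDLE_DIR/"; fi

# Point the jdktar variable in create-bundle.env at the new tarball.
ENV_FILE="$BUNDLE_DIR/db/create-bundle.env"
mkdir -p "$BUNDLE_DIR/db"
touch "$ENV_FILE"
grep -v '^jdktar=' "$ENV_FILE" > "$ENV_FILE.tmp" || true
echo "jdktar=$NEW_JDK" >> "$ENV_FILE.tmp"
mv "$ENV_FILE.tmp" "$ENV_FILE"
```

After this, rerunning bds-database-create-bundle.sh picks up the replacement JDK via the updated jdktar value.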
Manually Adding Resources if Download Sites are not Accessible to the BDS Database Bundle Creation Script
If one or more of the default download sites is inaccessible from the cluster management server, there are two ways to work around the problem:
Download the files from another server first and then provide bds-database-create-bundle.sh with the alternate path as an argument. For example:
$ ./bds-database-create-bundle.sh --yarn-conf-ws='http://nodexample:1234/config/yarn'
Because the script first searches locally in /bds-database-install for resources, you can download the files on another server, move them into /bds-database-install on the cluster management server, and then run the bundle creation script with no additional arguments. For example:
$ cp hadoop-xxxx.tar.gz bds-database-install/
$ cp hive-xxxx.tar.gz bds-database-install/
$ cp yarn-conf.zip bds-database-install/
$ cp hive-conf.zip bds-database-install/
$ cd db
$ ./bds-database-create-bundle.sh
Copying the Database Bundle to the Oracle Database Server
Use scp to copy the database bundle you created to the Oracle Database server. In the example below, dbnode is the database server. The Linux account and target directory here are arbitrary. Use any account authorized to scp to the specified path.
$ scp bds-database-install.zip oracle@dbnode:/home/oracle
The next step is to log on to the Oracle Database server and install the bundle.
Oracle Big Data SQL must be installed on both the Hadoop cluster management server and the Oracle Database server. This section describes the database server installation.
Prerequisites for Installing on an Oracle Database Server
The information in this section does not apply to the installation of Oracle Big Data SQL on an Oracle Exadata Database Machine connected to Oracle Big Data Appliance.
Important
For multi-node databases, you must repeat this installation on every node of the database. For each node, you may need to modify the DATABASE_IP parameter of the installation bundle in order to identify the correct network interface. This is described in the section If You Need to Change the Configured Database_IP Address.
Required Software
See the Oracle Big Data SQL Master Compatibility Matrix (Doc ID 2119369.1) in My Oracle Support for supported Linux distributions, Oracle Database release levels, and required patches.
Note:
Be sure that the correct Bundle Patch and one-off patch have been pre-applied before starting this installation. Earlier Bundle Patches are not supported for use with Big Data SQL 3.0 at this time.
Recommended Network Connections to the Hadoop Cluster
Oracle recommends a 10 Gb/s Ethernet connection between Oracle Database and the Hadoop cluster.
Extract and Run the Big Data SQL Installation Script
Perform the procedure in this section as the oracle user, except where sudo is indicated.
Check that /etc/oracle/cell/network-config/cellinit.ora exists. If it does not, do the following to create it and add the database server IP address:
Create the network-config directory and set permissions:
$ sudo mkdir -p /etc/oracle/cell/network-config/
$ sudo chown oracle:dba /etc/oracle/cell/network-config
$ sudo chmod ug+wx /etc/oracle/cell/network-config
Find the private IP address of the database server (through inet addr or other means). The address must include the prefix length, as in 100.112.10.36/24.
Create cellinit.ora under network-config and add the IP address as the value of ipaddress1, as shown below. Also be sure to include the two lines that follow.
ipaddress1=100.112.10.36/24
_skgxp_ant_options=1
_skgxp_dynamic_protocol=2
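The cellinit.ora creation above can be sketched as a short script. As a hedge for previewing, CONF_DIR defaults here to a local staging directory; on the actual database server, set CONF_DIR=/etc/oracle/cell/network-config and run the mkdir/chown/chmod steps with sudo as shown earlier. The IP address is an example value.

```shell
# Sketch: generate cellinit.ora with the database server's private IP.
# DB_IP must include the prefix length; the default below is an example.
CONF_DIR="${CONF_DIR:-./network-config}"   # real path: /etc/oracle/cell/network-config
DB_IP="${DB_IP:-100.112.10.36/24}"

mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/cellinit.ora" <<EOF
ipaddress1=$DB_IP
_skgxp_ant_options=1
_skgxp_dynamic_protocol=2
EOF
```

Inspect the generated file before moving it into place, since a wrong ipaddress1 value causes the installer to fail later.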
Locate the database bundle zip file that you copied over from the cluster management server.
Unzip the bundle into a temporary directory.
Change directories to bds-database-install, which was extracted from the zip file.
Run bds-database-install.sh. Note the optional parameters listed in Table 2-4.
Table 2-4 Optional Parameters for bds-database-install.sh
Parameter | Function
--version | Shows the bds-database-install.sh script version.
--info | Shows information about the cluster.
--ip-cell | Sets a particular IP address for the db_cell process. See If You Need to Change the Configured Database_IP Address below.
--install-as-secondary | Specifies a secondary cluster installation.
--uninstall-as-primary | Uninstalls Oracle Big Data SQL from the primary cluster.
--uninstall-as-secondary | Uninstalls Oracle Big Data SQL from a secondary cluster.
--jdk-home | Specifies the JDK home directory.
--grid-home | Specifies the Grid home directory.
--db-name | Specifies the Oracle Database SID.
--debug | Activates shell trace mode. If you report a problem, Oracle Support may want to see this output.
If You Need to Change the Configured Database_IP Address
The DATABASE_IP parameter in the bds-config.json file identifies the network interface of the database node. If you run bds-database-install.sh with no parameters, it searches for that IP address (with that prefix length, specifically) among the available network interfaces. You can pass the --ip-cell parameter to bds-database-install.sh to override the configured DATABASE_IP setting:
$ ./bds-database-install.sh --ip-cell=10.20.30.40/24
Possible reasons for doing this are:
bds-database-install.sh terminates with an error. The configured IP address (or prefix length) may be wrong.
There is an additional database node in the cluster and the defined DATABASE_IP address is not a network interface of the current node.
The connection is to a multi-node database. In this case, perform the installation on each database node. On each node, use the --ip-cell parameter to set the correct DATABASE_IP value.
To find the correct value for --ip-cell, you can list all network interfaces on a node as follows:
/sbin/ip -o -f inet addr show
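Because a missing or malformed prefix length is a common cause of installer failures, it can help to sanity-check the value before passing it to --ip-cell. The helper below is hypothetical (not part of the product), shown only as a minimal format check:

```shell
# Hypothetical helper: check that an --ip-cell value looks like
# an IPv4 address with a prefix length (e.g. 100.112.10.36/24).
valid_ip_cell() {
  echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$'
}

valid_ip_cell "10.20.30.40/24" && echo "ok: 10.20.30.40/24"
valid_ip_cell "10.20.30.40" || echo "rejected: missing prefix length"
```

A stricter check would also verify that each octet is at most 255 and that the address actually belongs to one of the node's interfaces.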
The steps for uninstalling Oracle Big Data SQL from HDP and from CDH systems are different.
Uninstalling the Software from an HDP Hadoop Cluster
In the Ambari web interface, stop the Big Data SQL service. All components on all DataNodes must be stopped.
On the Ambari command line, delete the Big Data SQL service using a REST API call.
curl --user admin:admin -H 'X-Requested-By:<user>' -X DELETE http://<ambari_server_fqdn>:<rest_api_port>/api/v1/clusters/<cluster_name>/services/BIGDATASQL
On each DataNode, find and kill any Oracle Big Data SQL processes that are running.
# ps -fea | grep bds
# kill -9 <pid>
On the Ambari command line, remove the BIGDATASQL stack from the services.
# rm -rf /var/lib/ambari-server/resources/stacks/HDP/<version>/services/BIGDATASQL
On each DataNode, remove the bd_cell RPM.
# yum remove -y bd_cell
On all DataNodes, remove the following directories.
# rm -rf /opt/oracle/bd_cell
# rm -rf /opt/oracle/bigdatasql
# rm -rf /tmp/bigdatasql
# rm -rf /var/log/oracle
On the Ambari command line, restart Ambari.
# ambari-server restart
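The REST DELETE call shown above is easier to reuse if the host, port, and cluster name are factored into variables. In this sketch all values are placeholders for your environment; the actual curl invocation is left commented out because it is destructive:

```shell
# Compose the Ambari REST URL for deleting the BIGDATASQL service.
# Host, port, and cluster name below are placeholders.
AMBARI_HOST="ambari.example.com"
REST_PORT="8080"
CLUSTER="mycluster"
BDS_URL="http://$AMBARI_HOST:$REST_PORT/api/v1/clusters/$CLUSTER/services/BIGDATASQL"
echo "DELETE $BDS_URL"
# curl --user admin:admin -H 'X-Requested-By: admin' -X DELETE "$BDS_URL"
```

Stop the service in Ambari before issuing the DELETE, as described in the first step of this procedure.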
Uninstalling Oracle Big Data SQL From a CDH Hadoop Cluster
To uninstall Big Data SQL from a CDH cluster (that is not hosted on Oracle Big Data Appliance), follow these steps:
In the Cloudera Manager GUI, do the following:
Stop the Big Data SQL service. All instances on all DataNodes must be stopped.
Delete the service from the cluster.
Deactivate and remove the parcel from all hosts.
Delete the parcel.
If resource management was not enabled on Cloudera Manager before Big Data SQL installation, disable the option Cgroup-based Resource Management and restart the YARN service.
On each DataNode, find and kill any running bds processes.
# ps -fea | grep bds
# kill -9 <pid>
On the Cloudera Manager command line, remove the Big Data SQL JAR from the csd directory. (The default location is /opt/cloudera/csd, but this may differ.)
# rm -f /opt/cloudera/csd/BIGDATASQL-1.0.jar
On each DataNode, do the following:
Remove the bd_cell RPM.
# yum remove -y bd_cell
Remove all Oracle Big Data SQL directories.
# rm -rf /opt/oracle/bd_cell
# rm -rf /opt/oracle/bigdatasql
# rm -rf /tmp/bigdatasql
# rm -rf /var/log/oracle
Procedures for securing Oracle Big Data SQL on Hortonworks HDP and on CDH-based systems other than Oracle Big Data Appliance are not covered in this version of the guide. For guidelines on securing Hadoop clusters for use with Big Data SQL, see MOS Document 2123125.1 at My Oracle Support.